Abstract

A successful method to describe the asymptotic behavior of a discrete time stochastic process
governed by some recursive formula is to relate it to the limit sets of a well-chosen mean
differential equation. Under an attainability condition, convergence to a given attractor of the
flow induced by this dynamical system was proved to occur with positive probability (Benaïm,
1999) for a class of Robbins-Monro algorithms. Benaïm et al. (2005) generalised this approach
to stochastic approximation algorithms whose average behavior is related to a differential
inclusion instead. We pursue the analogy by extending to this setting the result of convergence
with positive probability to an attractor.
1 Introduction



1.1 Settings and bibliography 

Stochastic approximation algorithms were born in the early 50s through the work of Robbins and
Monro [20] and Kiefer and Wolfowitz [16]. Consider a discrete time stochastic process $(x_n)_{n \geq 0}$
defined by the following recursive formula:

$$x_{n+1} - x_n = \gamma_{n+1} \left( F(x_n) + U_{n+1} \right), \qquad (1)$$

where $F : \mathbb{R}^m \to \mathbb{R}^m$ is a Lipschitz function, $(\gamma_n)_n$ is a positive decreasing sequence and $(U_n)_n$
a sequence of $\mathbb{R}^m$-valued random variables defined on a probability space $(\Omega, \mathcal{F}, P)$. In order to
describe the limit behavior of the sample paths $(x_n(\omega))_n$, a natural idea is to compare them to the
solution curves of the dynamical system induced by the ordinary differential equation

$$\frac{dx}{dt} = F(x). \qquad (2)$$

This is the celebrated method of ordinary differential equation (ODE), which was introduced by
Ljung in [19]. Heuristically, one can think of (1) as a kind of Cauchy-Euler approximation scheme
for numerically solving (2) with step size $(\gamma_n)_n$ and an added noise $(U_n)_n$. We could reasonably
expect that, under appropriate assumptions on $(\gamma_n)_n$ and if the noise $(U_n)_n$ vanishes, the asymptotic
behaviors of $(x_n)_n$ and the ODE are closely related.
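
To make this heuristic concrete, here is a minimal sketch of the recursion in dimension one. The choices $F(x) = -x$, $\gamma_n = 1/n$ and noise uniform on $[-1, 1]$, as well as the name `robbins_monro`, are illustrative assumptions and are not taken from the paper.

```python
import random


def robbins_monro(F, x0, n_steps, noise_scale=1.0, seed=0):
    """Iterate x_{n+1} = x_n + gamma_{n+1} * (F(x_n) + U_{n+1}).

    Assumptions of this sketch: gamma_n = 1/n and U_n i.i.d. uniform on
    [-noise_scale, noise_scale] (bounded, zero mean).
    """
    rng = random.Random(seed)
    x = x0
    for n in range(n_steps):
        gamma = 1.0 / (n + 1)                       # step size gamma_{n+1}
        u = rng.uniform(-noise_scale, noise_scale)  # noise U_{n+1}
        x = x + gamma * (F(x) + u)
    return x


def F(x):
    # Mean vector field of the ODE dx/dt = -x, whose only equilibrium is 0.
    return -x


x_final = robbins_monro(F, x0=5.0, n_steps=20000)
print(abs(x_final))
```

With these particular choices the iterate is the running average of the noise terms, so it approaches the equilibrium $0$ of the mean ODE, whatever the starting point.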

Thereafter, the method was studied and developed by many people (see Kushner and Clark [17],
Benveniste et al. [9], Duflo [12] or Kushner and Yin [18]). Originally, only simple dynamics were
considered, for example the negative of the gradient of a cost function. However, it appears in
several situations, for example in learning models or game theory, that the corresponding vector field
may be more complex.

Benaim and Hirsch have conducted, in a series of papers (essentially |5j and [5]), a thorough study 
of this method. They proved that the asymptotic behavior of stochastic approximation processes can
be described with a great deal of generality through the study of the asymptotics of the ODE. One
of the main results is the characterization of the limit sets of $(x_n(\omega))_n$ via the flow induced by $F$, in the
sense that, almost surely, these sets are compact, invariant and contain no proper attractor for the
deterministic flow (this is the notion of internal chain recurrence in the sense of Conley [11]; see
also Bowen [10]).

Now, let $F : \mathbb{R}^m \rightrightarrows \mathbb{R}^m$ be a sufficiently regular set-valued map and consider some discrete time
stochastic process $(x_n)_{n \geq 0}$ defined by the following recursive formula:

$$x_{n+1} - x_n - \gamma_{n+1} U_{n+1} \in \gamma_{n+1} F(x_n), \qquad (3)$$

where $(\gamma_n)_n$ is a positive decreasing sequence and $(U_n)_n$ a sequence of $\mathbb{R}^m$-valued random variables
defined on a probability space $(\Omega, \mathcal{F}, P)$.

In [6], Benaïm, Hofbauer and Sorin have generalized the ODE method to the algorithms given by
(3) and extended the characterization of limit sets in the sense that they are again, under certain
assumptions on the step size and the noise, connected and attractor free for the set-valued dynamic
induced by the differential inclusion

$$\frac{dx}{dt} \in F(x). \qquad (4)$$

This extension allows us to apply this technique to a much wider class of problems arising, for example,
in economics or game theory (see Benaïm, Hofbauer and Sorin [7]).

In this paper, we pursue the analogy between the ODE method and the differential inclusion method. 
The aim is to extend to the case of differential inclusions, the result of Benaim (see theorem 7.3 in 
[4]) which guarantees that, under certain assumptions on the step size and the noise, the stochastic 
approximation process converges with positive probability to a given attractor of the set-valued 
dynamical system induced by F. 

The organization of the paper is as follows. In section 1.2, we define a standard set-valued map and
introduce the crucial notion of attainability so as to state a simple version of the main result. In part
2, we introduce the different notions of internal chain transitivity, asymptotic pseudotrajectories
and perturbed solutions. Our main assumption (Hypothesis 2.6) is given, the convergence result is
stated in full generality and we define a generalised stochastic approximation process which satisfies
the above assumption. An example of an adaptive learning process to which our results may be applied
is given in section 3. Finally, the proof of a crucial result needed in our study is postponed to part
4.




1.2 The main result, a simple version 

In the following, $M \subset \mathbb{R}^m$ is a compact set.

Definition 1.1. [Standard set-valued map] A correspondence $F : \mathbb{R}^m \rightrightarrows \mathbb{R}^m$ is said to be standard
if it satisfies the following assumptions:

• for any $x \in \mathbb{R}^m$, $F(x)$ is a nonempty, compact and convex subset of $\mathbb{R}^m$,

• $F$ is closed, which means that its graph

$$\mathrm{Gr}(F) := \{ (x, y) \in \mathbb{R}^m \times \mathbb{R}^m \mid y \in F(x) \}$$

is closed,

• there exists $c > 0$ such that

$$\sup_{z \in F(x)} \|z\| \leq c (1 + \|x\|).$$

Under the above assumptions, it is well known (see Aubin and Cellina [1]) that (4) admits at least
one solution (i.e. an absolutely continuous mapping $\mathbf{x} : \mathbb{R} \to \mathbb{R}^m$ such that $\dot{\mathbf{x}}(t) \in F(\mathbf{x}(t))$ for almost
every $t$) through every initial point.

We call $S_x$ the set of solutions with initial condition $\mathbf{x}(0) = x$. The set-valued dynamical system
induced by the differential inclusion will be denoted $\Phi = (\Phi_t)_{t \in \mathbb{R}}$. To any $x \in \mathbb{R}^m$, it associates the
nonempty set

$$\Phi_t(x) := \{ \mathbf{x}(t) \mid \mathbf{x} \in S_x \}.$$

Finally, $S_\Phi$ is the set of all solution curves. In order to understand the main result, we recall some
classical definitions about set-valued dynamics.

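Definition 1.2. A nonempty compact set $A \subset \mathbb{R}^m$ is called an attractor for $\Phi$ provided it is
invariant (i.e. for all $x \in A$ there exists a solution $\mathbf{x}$ to the differential inclusion with $\mathbf{x}(0) = x$ and such that $\mathbf{x}(\mathbb{R}) \subset A$)
and there is a neighborhood $U$ of $A$ with the property that, for every $\varepsilon > 0$, there exists $t_\varepsilon > 0$
such that

$$\Phi_t(U) \subset N^{\varepsilon}(A)$$

for all $t \geq t_\varepsilon$.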

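Definition 1.3. Let $A \subset \mathbb{R}^m$ be an attractor for the set-valued dynamical system. The basin of
attraction of $A$ is the set

$$\mathcal{B}(A) := \{ x \in \mathbb{R}^m : \omega_\Phi(x) \subset A \},$$
where $\omega_\Phi(x) = \bigcap_{t \geq 0} \overline{\Phi_{[t, +\infty[}(x)}$ is the omega limit set of the point $x$.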

Now consider a discrete time stochastic process $(x_n)_n$ in $M$ defined by (3), and satisfying the
following assumptions:

(i) For all $c > 0$,

$$\sum_n e^{-c/\gamma_n} < \infty,$$

(ii) $(U_n)_n$ is uniformly bounded and

$$E(U_{n+1} \mid \mathcal{F}_n) = 0,$$

(iii) $F$ is a standard set-valued map.
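
As a quick sanity check on assumption (i), consider the worked example below. The two step-size sequences are illustrative choices, not taken from the paper. It suggests that $\gamma_n = 1/n$ satisfies the summability condition for every $c > 0$, whereas a logarithmically decaying step size does not.

```latex
% Step size gamma_n = 1/n: a geometric series, finite for every c > 0.
\sum_{n \geq 1} e^{-c/\gamma_n}
  = \sum_{n \geq 1} e^{-cn}
  = \frac{e^{-c}}{1 - e^{-c}}
  = \frac{1}{e^{c} - 1} < \infty .

% Step size gamma_n = 1/\log(n+1): a Riemann series, divergent when c <= 1.
\sum_{n \geq 1} e^{-c/\gamma_n}
  = \sum_{n \geq 1} (n+1)^{-c} = +\infty
  \quad \text{for } 0 < c \leq 1 .
```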
Set $\tau_n := \sum_{i=1}^{n} \gamma_i$ and $m(t) := \sup\{ j \mid \tau_j \leq t \}$. We call $X$ the continuous time affine interpolated
process induced by $(x_n)_n$ and $\overline{\gamma}$ the piecewise constant deterministic process induced by $(\gamma_n)_n$:

$$X(\tau_i + s) = x_i + s \, \frac{x_{i+1} - x_i}{\gamma_{i+1}} \ \text{ for } s \in [0, \gamma_{i+1}], \qquad \overline{\gamma}(\tau_i + s) := \gamma_{i+1} \ \text{ for } s \in [0, \gamma_{i+1}[,$$

and consider the limit set of $X$:

$$L(X) := \bigcap_{t \geq 0} \overline{\{ X(s) : s \geq t \}}.$$
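
The interpolation can be sketched in a few lines for a scalar sequence. The helper name `make_interpolation`, the list-based layout and the restriction to times in $[0, \tau_N]$ are assumptions of this example only.

```python
from bisect import bisect_right


def make_interpolation(xs, gammas):
    """Build tau, m(.) and the affine interpolation X(.) of a finite sequence.

    xs     = [x_0, ..., x_N]          (scalar iterates)
    gammas = [gamma_1, ..., gamma_N]  (positive step sizes)
    Valid for times t in [0, tau_N] only.
    """
    taus = [0.0]                      # tau_0 = 0 (empty sum)
    for g in gammas:
        taus.append(taus[-1] + g)     # tau_n = gamma_1 + ... + gamma_n

    def m(t):
        # Largest index j such that tau_j <= t.
        return bisect_right(taus, t) - 1

    def X(t):
        i = min(m(t), len(gammas) - 1)   # keep t = tau_N on the last segment
        s = t - taus[i]
        # gammas[i] stores gamma_{i+1} because the list is 0-indexed.
        return xs[i] + s * (xs[i + 1] - xs[i]) / gammas[i]

    return taus, m, X


taus, m, X = make_interpolation([0.0, 1.0, 3.0], [1.0, 0.5])
print(taus, m(1.2), X(0.5), X(1.25), X(1.5))
```

The interpolated path passes through $x_i$ at time $\tau_i$ and is affine in between, which is what makes it comparable to solution curves of the differential inclusion.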

The attainability condition is crucial to show that $X$ converges with positive probability to a given
attractor.

Definition 1.4. A point $p \in M$ is attainable if, for any $t > 0$ and any neighborhood $U$ of $p$,

$$P(\exists s \geq t : X(s) \in U) > 0.$$

We call $\mathrm{Att}(X)$ the set of points attainable by $X$. The following statement is a special case of our
main result, Theorem 2.14.

Theorem 1.5. Let $A \subset M$ be an attractor for $\Phi$ with basin of attraction $\mathcal{B}(A)$. If $\mathrm{Att}(X) \cap \mathcal{B}(A) \neq \emptyset$,
then

$$P(L(X) \subset A) > 0.$$



2 Convergence with positive probability 



2.1 Set-valued dynamical systems relative to a differential inclusion. 

We recall here some definitions and results due to Benaïm et al. (see [6]).

Let $F : \mathbb{R}^m \rightrightarrows \mathbb{R}^m$ be a standard set-valued map and $\Phi$ be the set-valued dynamical system
associated to the differential inclusion

$$\frac{dx}{dt} \in F(x). \qquad (5)$$

The notion of internally chain transitive set (ICT set) was introduced by Benaïm and Hirsch in [5]
to analyse certain perturbations of the flow relative to an ODE. This is an extension of the notion of
chain recurrence due to Conley [11]. The concept of ICT sets was extended to differential inclusions
by Benaïm et al. in [6].

We refer to this last reference for an accurate description of ICT sets. However, we only need
an equivalent definition of these objects here and therefore we will say that $L$ is internally chain
transitive for $\Phi$ if and only if it is invariant, compact and the restricted set-valued dynamical system
$\Phi|_L$ admits no proper attractor (i.e. no attractor distinct from $L$).

Theorem 2.1 (Benaïm et al., 2005). Let $L$ be an internally chain transitive set and $A$ be an attractor
for $\Phi$ with basin of attraction $\mathcal{B}(A)$. Then

$$\mathcal{B}(A) \cap L \neq \emptyset \implies L \subset A.$$
The space $C(\mathbb{R}_+, \mathbb{R}^m)$ of continuous paths, endowed with the metric

$$D(x, x') := \sum_{k=1}^{\infty} \frac{1}{2^k} \min\left( \sup_{u \in [0,k]} \|x(u) - x'(u)\|, \, 1 \right),$$

is complete. A continuous map $X : \mathbb{R}_+ \to \mathbb{R}^m$ is an asymptotic pseudotrajectory (APT) of the
set-valued dynamical system $(\Phi_t)_{t \geq 0}$ if $\lim_{t \to +\infty} D(X(t + \cdot), S_\Phi) = 0$. In other terms, for any
$T > 0$,

$$\lim_{t \to +\infty} \inf_{z \in S_\Phi} \|X(t + \cdot) - z(\cdot)\|_{[0,T]} = 0,$$

where $\| \cdot \|_{[0,T]}$ denotes the uniform norm on $[0, T]$. Heuristically this means that, for any $T > 0$,
the curve joining $X(t)$ to $X(t + T)$ shadows the trajectory of some solution with arbitrary accuracy,
provided $t$ is large enough.

A fundamental property of asymptotic pseudotrajectories is that, if $X$ is a bounded APT, then its
limit set $L(X)$ is internally chain transitive. Consequently, by Theorem 2.1, we
have

Corollary 2.2. Let $X$ be an asymptotic pseudotrajectory of the set-valued dynamical system and
$A$ an attractor for $\Phi$. If $L(X)$ meets the basin of attraction of $A$, then $L(X)$ is contained in $A$.

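Let $\delta$ be a positive real number. Then $F^\delta$ is the set-valued map defined by

$$F^\delta(x) := \{ y \mid \exists z \in B(x, \delta) \text{ such that } d(y, F(z)) < \delta \}. \qquad (6)$$

Definition 2.3. A continuous function $y : \mathbb{R}_+ \to \mathbb{R}^m$ is a $(\delta(\cdot), U(\cdot))$-perturbed solution of the
differential inclusion (5) if

(i) $y$ is absolutely continuous,

(ii) there exists a function $\delta : \, ]0, +\infty[ \, \to \mathbb{R}_+$ such that $\delta(t) \to 0$ as $t \to +\infty$ and, for almost every $t > 0$,

$$\frac{dy(t)}{dt} - U(t) \in F^{\delta(t)}(y(t)),$$

(iii) the function $U : \mathbb{R}_+ \to \mathbb{R}^m$ is locally integrable and such that, for any $T > 0$,

$$\lim_{t \to +\infty} \left\| \int_t^{t + \cdot} U(u) \, du \right\|_{[0,T]} = 0.$$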

We recall the following theorem due to Benaïm et al. (see [6], Theorem 4.2).

Theorem 2.4 (Benaïm et al., 2005). Any bounded perturbed solution of the differential inclusion
(5) is an asymptotic pseudotrajectory of the set-valued dynamical system $\Phi$.



2.2 A deterministic result 

Let $X$ be an APT of the set-valued dynamical system $(\Phi_t)_{t \geq 0}$. For any $T > 0$, we define the
quantity

$$d_X(T) := \sup_{k \in \mathbb{N}} \inf_{z \in S_\Phi} \|z(\cdot) - X(kT + \cdot)\|_{[0,T]}.$$
The following lemma is the extension to differential inclusions of Lemma 6.8 in Benaïm [4].
Lemma 2.5. Let $A \subset \mathbb{R}^m$ be an attractor for the set-valued dynamical system $\Phi$, with basin of
attraction $\mathcal{B}(A)$. Then, for any compact set $K \subset \mathcal{B}(A)$, there exist positive real numbers $\alpha(K)$
and $T(K)$ such that

$$\left( X(0) \in K \text{ and } d_X(T(K)) < \alpha(K) \right) \implies L(X) \subset A.$$

Proof. Let $W$ be an open set such that

$$A \cup K \subset W \subset \overline{W} \subset \mathcal{B}(A).$$

There exists $\alpha > 0$ such that $N^{3\alpha}(A) \subset W$ and $N^{\alpha}(K) \subset W$. By definition of an attractor, there
exists then a positive number $T$ (which depends on $\alpha$ and $W$) such that

$$\Phi_{[T, +\infty[}(W) \subset N^{\alpha}(A).$$

Assume now that $X(0) \in K$ and $d_X(T) < \alpha$. There exists a solution $z^1$ which shadows $X$ on $[0, T]$;
in particular,

$$z^1(0) \in N^{\alpha}(X(0)) \subset W \quad \text{and} \quad X(T) \in N^{\alpha}(z^1(T)).$$
By definition of $T$, $z^1(T) \in N^{\alpha}(A)$, which means that $X(T) \in N^{2\alpha}(A) \subset W$.

By a recursive argument, we show that the sequence $(X(kT))_{k \geq 1}$ belongs to the set $N^{2\alpha}(A)$. Assume
that $X(kT) \in N^{2\alpha}(A)$. Then there exists a solution $z^{k+1}(\cdot)$ which is $\alpha$-close to $X(kT + \cdot)$ on $[0, T]$;
in particular,

$$z^{k+1}(0) \in N^{\alpha}(X(kT)) \subset N^{3\alpha}(A) \subset W \quad \text{and} \quad X(kT + T) \in N^{\alpha}(z^{k+1}(T)) \subset N^{2\alpha}(A).$$
Consequently, the limit set $L(X)$ is contained in $\mathcal{B}(A)$. Hence, $L(X) \subset A$ by Corollary 2.2. ■



2.3 Stochastic processes 

In the following, $(X(t))_{t \geq 0}$ will be a continuous time $\mathbb{R}^m$-valued stochastic process, adapted to some
nondecreasing family of $\sigma$-algebras $(\mathcal{F}_t)_t$.

Hypothesis 2.6. There exists a map $\omega : \mathbb{R}_+^3 \to \mathbb{R}_+$ such that, for any $\alpha > 0$ and $T > 0$,

$$P\left( \sup_{s \geq t} \inf_{z \in S_\Phi} \|z(\cdot) - X(s + \cdot)\|_{[0,T]} \geq \alpha \,\Big|\, \mathcal{F}_t \right) \leq \omega(t, \alpha, T) \downarrow_{t \to +\infty} 0. \qquad (7)$$

Recall that $\mathrm{Att}(X)$ is the set of points attainable by $X$.

Theorem 2.7. Let $A$ be an attractor and $(X(t))_{t \geq 0}$ be an adapted process satisfying Hypothesis 2.6.
Then, if $\mathrm{Att}(X) \cap \mathcal{B}(A) \neq \emptyset$, we have

$$P(L(X) \subset A) > 0.$$

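Proof. We adapt the proof of Theorem 7.3 in Benaïm [4]. Let $U$ be an open set whose closure is included in $\mathcal{B}(A)$
and call $K = \overline{U}$. There exist $\alpha(K)$ and $T(K)$ such that

$$\left( X(0) \in K \text{ and } d_X(T(K)) < \alpha(K) \right) \implies L(X) \subset A.$$

Let $t > 0$ be such that $\omega(t, \alpha, T) < 1$ and denote $t_n(k) = k / 2^n$ for $n$ and $k$ in $\mathbb{N}$. We define the stopping
time

$$\tau_n := \inf \{ t_n(k) \mid X(t_n(k)) \in U, \ t_n(k) \geq t \}.$$
On the intersection of the events $\{\tau_n < \infty\}$ and $\{ \sup_{s \geq \tau_n} \inf_{z \in S_\Phi} \|z(\cdot) - X(s + \cdot)\|_{[0,T]} < \alpha \}$ the
set $L(X)$ is included in $A$. Consequently, we have

$$P(L(X) \subset A) \geq \sum_{k \geq [2^n t] + 1} E\left( P\left( \sup_{s \geq t_n(k)} \inf_{z \in S_\Phi} \|z(\cdot) - X(s + \cdot)\|_{[0,T]} < \alpha \,\Big|\, \mathcal{F}_{t_n(k)} \right) \mathbf{1}_{\tau_n = t_n(k)} \right)$$

$$\geq \sum_{k \geq [2^n t] + 1} \left( 1 - \omega(t_n(k), \alpha, T) \right) P(\tau_n = t_n(k)) \geq \left( 1 - \omega(t, \alpha, T) \right) P(\tau_n < +\infty),$$

since $\omega(t_n(k), \alpha, T) \leq \omega(t, \alpha, T)$ for all $k \geq [2^n t] + 1$. On the other hand, the sequence of events
$\{\tau_n < +\infty\}$ is increasing and we have

$$\lim_n \uparrow \{\tau_n < +\infty\} = \{ \exists s \geq t \mid X(s) \in U \}.$$

Now take an attainable point $p \in \mathcal{B}(A)$ and $U$ a neighborhood of $p$ such that $\overline{U} \subset \mathcal{B}(A)$. We have

$$P(L(X) \subset A) \geq \left( 1 - \omega(t, \alpha, T) \right) P(\exists s \geq t \mid X(s) \in U) > 0,$$

and the proof is complete. ■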

Now, we consider a compact set $M \subset \mathbb{R}^m$ and a standard set-valued map $F : \mathbb{R}^m \rightrightarrows \mathbb{R}^m$. Let
$T > 0$. Denote $\Phi^T(M) := \bigcup_{s \in [0,T]} \Phi_s(M)$, $\|F\| := \sup_{x \in \Phi^T(M)} \sup_{y \in F(x)} \|y\|$ and let us define the
compact set

$$K_C := K_{(M,T,C)} = \{ y \in \mathrm{Lip}([0,T], \mathbb{R}^m) \mid \mathrm{Lip}(y) \leq \|F\| + C + 1, \ y(0) \in M \},$$
where $C$ is a positive constant.

Remark 2.8. $K_C$ is appropriate to our situation since it contains every solution curve, restricted
to an interval of length $T$, and any $(\delta(\cdot), U(\cdot))$-perturbed solution of the differential inclusion, with
$\sup_{t \in [0,T]} \|U(t)\| \leq C$ and $\delta \leq 1$.

For $\delta \in [0, 1]$, let us define the set-valued application (with the convention $\Lambda^0 = \Lambda$):

$$\Lambda^\delta : K_C \rightrightarrows K_C, \quad z \mapsto \Lambda^\delta(z), \qquad (8)$$

where $y \in \Lambda^\delta(z)$ if and only if there exists an integrable $h : [0,T] \to \mathbb{R}^m$ such that $h(u) \in
F^\delta(z(u))$ for all $u \in [0,T]$ and

$$y(\tau) = z(0) + \int_0^\tau h(u) \, du, \quad \forall \tau \in [0,T].$$

Remark that $\mathrm{Fix}(\Lambda) := \{ z \in K_C \mid z \in \Lambda(z) \}$ is the set of the restrictions to $[0,T]$ of the solution
curves starting from $M$, which we denote $S_{[0,T]}$ from now on. Additionally, we call $d_{[0,T]}$ the distance
associated to the uniform norm on $[0,T]$. The following lemma is an immediate consequence of
a corollary proved in the last section.

Lemma 2.9. Let $C > 0$ and $\alpha > 0$. There exist $\varepsilon > 0$ and $\delta_0 > 0$ such that, for any $\delta < \delta_0$,

$$d_{[0,T]}(z, \Lambda^\delta(z)) < \varepsilon \implies d_{[0,T]}(z, S_{[0,T]}) < \alpha.$$
As a consequence, we obtain the following crucial result.
Proposition 2.10. Assume that there exists a function $\delta : \mathbb{R}_+ \to \mathbb{R}_+$ converging to zero and a
uniformly bounded random process $(U(t))_{t \geq 0}$ such that $(X(t))_{t \geq 0}$ is almost surely a bounded $(\delta, U)$-
perturbed solution of the differential inclusion (5) and such that $X(0) \in M$. Then, if $U$ satisfies the
following property

$$P\left( \sup_{s \geq t} \left\| \int_s^{s + \cdot} U(u) \, du \right\|_{[0,T]} \geq \varepsilon \,\Big|\, \mathcal{F}_t \right) \leq \omega(t, \varepsilon, T) \downarrow_{t \to +\infty} 0, \qquad (9)$$

$X(\cdot)$ satisfies Hypothesis 2.6 and Theorem 2.7 may be applied.



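Proof. First, $X(\cdot)$ is almost surely an asymptotic pseudotrajectory of the set-valued dynamical
system by Theorem 2.4. For Lipschitz (classical) dynamical systems, the fact that Hypothesis 2.6 is
satisfied follows from an application of Gronwall's lemma (see Benaïm [4], section 7). However, this
is no longer possible in our setting and this is the reason why we need Lemma 2.9. By assumption,
we have almost surely

$$\frac{dX(t)}{dt} - U(t) \in F^{\delta(t)}(X(t)), \quad \text{for almost every } t > 0.$$

Let $T > 0$. For any $\tau \in [0,T]$,

$$X(s + \tau) - \int_s^{s + \tau} U(u) \, du \in X(s) + \int_0^\tau F^{\delta(s)}(X(s + u)) \, du.$$

Hence,

$$d_{[0,T]}\left( X(s + \cdot), \Lambda^{\delta(s)}(X(s + \cdot)) \right) \leq \left\| \int_s^{s + \cdot} U(u) \, du \right\|_{[0,T]}$$

and

$$P\left( \sup_{s \geq t} d_{[0,T]}\left( X(s + \cdot), \Lambda^{\delta(t)}(X(s + \cdot)) \right) \geq \varepsilon \,\Big|\, \mathcal{F}_t \right) \leq P\left( \sup_{s \geq t} \left\| \int_s^{s + \cdot} U(u) \, du \right\|_{[0,T]} \geq \varepsilon \,\Big|\, \mathcal{F}_t \right) \leq \omega(t, \varepsilon, T).$$

Now let $\alpha > 0$. By Lemma 2.9, there exists $\varepsilon > 0$ (which depends on $T$ and $\alpha$) such that, for $t$ large
enough and $s \geq t$,

$$d_{[0,T]}\left( X(s + \cdot), S_{[0,T]} \right) \geq \alpha \implies d_{[0,T]}\left( X(s + \cdot), \Lambda^{\delta(t)}(X(s + \cdot)) \right) \geq \varepsilon.$$
Consequently, for these choices of $t$ and $\varepsilon$,

$$P\left( \sup_{s \geq t} d_{[0,T]}\left( X(s + \cdot), S_{[0,T]} \right) \geq \alpha \,\Big|\, \mathcal{F}_t \right) \leq P\left( \sup_{s \geq t} d_{[0,T]}\left( X(s + \cdot), \Lambda^{\delta(t)}(X(s + \cdot)) \right) \geq \varepsilon \,\Big|\, \mathcal{F}_t \right)$$

$$\leq P\left( \sup_{s \geq t} \left\| \int_s^{s + \cdot} U(u) \, du \right\|_{[0,T]} \geq \varepsilon \,\Big|\, \mathcal{F}_t \right) \leq \omega(t, \varepsilon(\alpha, T), T),$$

and the proof is complete. ■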
2.4 Convergence of Stochastic approximation algorithms



We introduce here a class of stochastic approximation processes which generalize the Robbins-Monro
algorithms. Under some assumptions on the step size and the noise, we prove that Hypothesis 2.6
is satisfied and that the conclusion of Theorem 2.7 holds.

Definition 2.11 (Generalised stochastic approximation process). Let $(U_n)_n$ be a uniformly bounded
$\mathbb{R}^m$-valued random process, $(\gamma_n)_n$ a deterministic positive real sequence and $(F_n)_n$ a sequence of set-
valued maps on $\mathbb{R}^m$. We say that $(x_n)_n$ is a generalised stochastic approximation process relative
to the standard set-valued map $F$ if the following assumptions are satisfied:

(i) we have the recursive formula

$$x_{n+1} - x_n - \gamma_{n+1} U_{n+1} \in \gamma_{n+1} F_n(x_n),$$

(ii) the step size satisfies

$$\sum_n \gamma_n = +\infty, \qquad \lim_n \gamma_n = 0,$$

(iii) for all $T > 0$, we have almost surely

$$\lim_{n \to +\infty} \sup \left\{ \left\| \sum_{i=n}^{k-1} \gamma_{i+1} U_{i+1} \right\| \ : \ k \text{ such that } \sum_{i=n}^{k-1} \gamma_{i+1} \leq T \right\} = 0,$$

(iv) for all $n \geq 0$, $x_n \in M$,

(v) for any $\delta > 0$, there exists $n_0 \in \mathbb{N}$ such that

$$\forall n \geq n_0, \quad F_n(x_n) \subset F^\delta(x_n).$$

In the following we will call $X = (X(t))_t$ the continuous time affine interpolated process induced
by a given generalised stochastic approximation process $(x_n)_n$.

Proposition 2.12. The process $X$ is almost surely a $(\delta, U)$-perturbed solution, for some determin-
istic function $\delta$, and $U$ the piecewise constant continuous time process associated to $(U_n)_n$:

$$U(t) := U_{n+1}, \quad \forall t \in [\tau_n, \tau_{n+1}[.$$

Proof. By straightforward computations (see the proof of Proposition 1.3 in Benaïm et al. [6]), it
is not difficult to see that, almost surely, $(X(t))_t$ is a perturbed solution associated to $U$ and

$$\delta(t) := \inf \{ \delta > 0 \mid \tau_n \geq t \Rightarrow F_n(x_n) \subset F^\delta(x_n) \} + \overline{\gamma}(t) \left( \|U(t)\| + c \left( 1 + \sup_{x \in M} \|x\| \right) \right),$$

which converges to 0 by assumptions (ii) and (v) of Definition 2.11. ■

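Remark 2.13. The condition (9) is equivalent to

$$P\left( \sup_{m \geq n} \ \sup_{m < k \leq m(\tau_m + T)} \left\| \sum_{i=m}^{k-1} \gamma_{i+1} U_{i+1} \right\| \geq \varepsilon \,\Big|\, \mathcal{F}_n \right) \leq \omega(n, \varepsilon, T) \downarrow_n 0, \qquad (10)$$

and we use the notation $\Delta(n, T) := \sup_{n < k \leq m(\tau_n + T)} \left\| \sum_{i=n}^{k-1} \gamma_{i+1} U_{i+1} \right\|$ in the sequel.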
Our main result is now stated in full generality.

Theorem 2.14. Let $(x_n)_n$ be a stochastic approximation algorithm such that $(U_n)_n$ satisfies (10).
Then, if $A$ is an attractor relative to $F$, we have

$$\mathrm{Att}(X) \cap \mathcal{B}(A) \neq \emptyset \implies P(L(X) \subset A) > 0.$$

Proof. By Proposition 2.12, the conditions required to apply Proposition 2.10 are satisfied. Hence,
Hypothesis 2.6 is satisfied and the result follows directly from Theorem 2.7. ■

In the particular case where $(U_n)_n$ is a martingale difference, $E(U_{n+1} \mid \mathcal{F}_n) = 0$, (10) is satisfied
under simple assumptions on the noise and step size.

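Proposition 2.15. Let $(U_n^0)_n$ be a martingale difference noise (not necessarily bounded) and as-
sume that one of the following assumptions is satisfied:

1) There exists some $q \geq 2$ such that

$$\sum_n \gamma_n^{1 + q/2} < +\infty \quad \text{and} \quad \sup_n E\left( \|U_n^0\|^q \right) < +\infty.$$

2) There exists a deterministic sequence $(M_n)_n$ such that $M_n^2 = o\left( (\gamma_n \log n)^{-1} \right)$ and, for any
$n \in \mathbb{N}$,

$$\forall \theta \in \mathbb{R}^m, \quad E\left( \exp \langle \theta, U_{n+1}^0 \rangle \mid \mathcal{F}_n \right) \leq \exp\left( \frac{M_{n+1}^2}{2} \|\theta\|^2 \right).$$

Then (10) is satisfied.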

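Proof. For the first point, we refer the reader to Benaïm [3]. Now for the second, let $\theta \in \mathbb{R}^m$ and
consider the process $(Z_n(\theta))_n$ defined by

$$Z_n(\theta) := \exp\left( \sum_{i=1}^{n} \langle \theta, \gamma_i U_i^0 \rangle - \frac{\|\theta\|^2}{2} \sum_{i=1}^{n} \gamma_i^2 M_i^2 \right).$$

$(Z_n(\theta))_n$ is a supermartingale by assumption. Hence, if we denote $S_n := \sum_{i=n}^{m_n - 1} \gamma_{i+1}^2 M_{i+1}^2$ and
$m_n := m(\tau_n + T)$, for any $\beta > 0$,

$$P\left( \sup_{n < k \leq m_n} \left\langle \theta, \sum_{i=n}^{k-1} \gamma_{i+1} U_{i+1}^0 \right\rangle \geq \beta \,\Big|\, \mathcal{F}_n \right) \leq P\left( \sup_{n < k \leq m_n} Z_k(\theta) \geq Z_n(\theta) \exp\left( \beta - \frac{\|\theta\|^2}{2} S_n \right) \,\Big|\, \mathcal{F}_n \right) \leq \exp\left( \frac{\|\theta\|^2}{2} S_n - \beta \right).$$

Let $e \in \{ e_1, \ldots, e_m, -e_1, \ldots, -e_m \}$. We have

$$P\left( \sup_{n < k \leq m_n} \left\langle e, \sum_{i=n}^{k-1} \gamma_{i+1} U_{i+1}^0 \right\rangle \geq \varepsilon \,\Big|\, \mathcal{F}_n \right) = P\left( \sup_{n < k \leq m_n} \left\langle \frac{\varepsilon}{S_n} e, \sum_{i=n}^{k-1} \gamma_{i+1} U_{i+1}^0 \right\rangle \geq \frac{\varepsilon^2}{S_n} \,\Big|\, \mathcal{F}_n \right) \leq \exp\left( \frac{-\varepsilon^2}{2 S_n} \right),$$

so that

$$P\left( \sup_{n < k \leq m_n} \left\| \sum_{i=n}^{k-1} \gamma_{i+1} U_{i+1}^0 \right\| \geq \varepsilon \,\Big|\, \mathcal{F}_n \right) \leq 2m \exp\left( \frac{-\varepsilon^2}{2 S_n} \right).$$

Let us introduce $\varepsilon_n := \gamma_n M_n^2 \log n$. Then, since $\sum_{i=n}^{m_n - 1} \gamma_{i+1} \leq T$, we have

$$P\left( \sup_{n < k \leq m_n} \left\| \sum_{i=n}^{k-1} \gamma_{i+1} U_{i+1}^0 \right\| \geq \varepsilon \,\Big|\, \mathcal{F}_n \right) \leq 2m \exp\left( \frac{-\varepsilon^2 \log n}{2T \sup_{k \geq n} \varepsilon_k} \right).$$

Since $\sup_{k \geq n} \varepsilon_k \to 0$, the application

$$(n, \varepsilon, T) \mapsto \omega(n, \varepsilon, T) := 2m \sum_{p \geq n} \exp\left( \frac{-\varepsilon^2 \log p}{2T \sup_{k \geq p} \varepsilon_k} \right)$$

converges to 0 as $n$ tends to infinity and the proof is complete. ■

The following is a useful consequence of this statement.
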
Corollary 2.16. Assume that $U_n$ can be written $U_n = U_n^0 + U_n^1$, where

• $(U_n^0)_n$ is a martingale difference noise satisfying one of the assumptions in Proposition 2.15,

• (10) is satisfied for $(U_n^1)_n$.
Then (10) is satisfied for $(U_n)_n$.



3 Application to the Markovian fictitious play learning model 

We discuss here a Markovian strategy in a two-person game and study the induced dynamic. The
model is studied by Benaïm and Raimond in [8] and was inspired by a so-called pairwise comparison
dynamic introduced in Benaïm et al. [7].

3.1 The model 

The motivation is the following. We assume, in the initial model, that the information situation
is the same as in the smooth fictitious play developed by Fudenberg and Levine (see [14] and
[15]), where the considered player uses a best response strategy against the empirical moves of
his opponent, with respect to a smooth perturbation of the payoff function. A player adopting a
smooth fictitious play strategy needs to be informed of his payoff function as well as the moves of
his opponents up to this stage.

For some reason (computational limits, restricted available strategies...), we consider here that the
set of moves he can play at some stage is a subset of his action set, which depends on the last action
taken.

More formally, we consider a two-player game in normal form. $I$ and $L$ are the (finite) sets of moves
of player 1 and player 2 respectively. These sets are of the form

$$I = \{1, \ldots, m^1\}, \qquad L = \{1, \ldots, m^2\}.$$

The maps $(U^1, U^2) : I \times L \to \mathbb{R} \times \mathbb{R}$ denote the payoff (or utility) functions of the players. The sets of
mixed strategies available to the players are denoted $X = \Delta(I)$ and $Y = \Delta(L)$, where

$$\Delta(I) := \left\{ x = (x_1, \ldots, x_{m^1}) \in \mathbb{R}_+^{m^1} \ \Big| \ \sum_{i=1,\ldots,m^1} x_i = 1 \right\},$$

and analogously for $\Delta(L)$. We will use the classical abuse of language for $y \in Y$:

$$U^1(i, y) = \sum_{l \in L} U^1(i, l) \, y_l.$$

For $x \in X$, $y \in Y$, we call $\mathrm{br}^1(y) := \mathrm{Argmax}_{x \in X} U^1(x, y)$ and $\mathrm{br}^2(x) := \mathrm{Argmax}_{y \in Y} U^2(x, y)$. We
assume that a given game is played repeatedly and call $X_n$ (resp. $Y_n$) the move of player 1 (resp.
player 2) at stage $n$. The empirical distribution of moves up to stage $n$ is denoted $x_n$ (resp. $y_n$).

Let $M_0^1$ be an irreducible matrix, reversible with respect to its invariant probability distribution $\pi_0^1$.
$M_0^1$ represents the possibility or not to play an action depending on the last move: player 1 will be
able to play action $j$ after having played $i$ if and only if $M_0^1(i, j) > 0$. For $n \in \mathbb{N}$ and $y \in Y$, let us
define the Markov matrix

$$M_n^1(i, j; y) = \begin{cases} M_0^1(i, j) \exp\left( -\beta_n^1 \left( U^1(i, y) - U^1(j, y) \right)^+ \right) & \text{if } i \neq j, \\ 1 - \sum_{k \neq i} M_n^1(i, k; y) & \text{if } i = j, \end{cases}$$
where $(\beta_n^1)_n$ is some positive deterministic sequence.

Definition 3.1. A Markovian fictitious play (MFP) strategy for player 1, associated with $(\beta_n^1)_n$
and $(M_0^1, \pi_0^1)$, is a strategy $\sigma$ such that, for any $n \in \mathbb{N}$,

$$P_\sigma(X_{n+1} = j \mid \mathcal{F}_n) = M_n^1(X_n, j; y_n).$$

From now on, we assume that both players use a Markovian fictitious play strategy, associated with $M_0^p$
and $(\beta_n^p)_n$ ($p = 1, 2$). Let us introduce the random sequences

$$V_n := (\delta_{X_n}, \delta_{Y_n}) \quad \text{and} \quad v_n := \frac{1}{n} \sum_{i=1}^{n} V_i = (x_n, y_n).$$

Benaïm and Raimond proved that there exist positive constants $\tilde{A}^p$, $p = 1, 2$ (see section 3.2), such
that, if $\beta_n^p = A^p \log n$ with $A^p < \tilde{A}^p$, then $(v_n)_n$ is a generalised stochastic approximation process,
taking values in $\Delta = X \times Y$, with step size $\gamma_n = 1/n$, relative to the map $F(x, y) = \{ (\alpha, \beta) \mid \alpha \in
\mathrm{br}^1(y) - x, \ \beta \in \mathrm{br}^2(x) - y \}$. Note that the corresponding differential inclusion is the best response
dynamic:

$$(\dot{x}, \dot{y}) \in (\mathrm{br}^1(y), \mathrm{br}^2(x)) - (x, y).$$
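
To illustrate the best response dynamic, the sketch below runs its noise-free Euler discretization with step size $1/(n+1)$, which is classical fictitious play rather than the Markovian strategy of Definition 3.1. The $2 \times 2$ coordination game, the tie-breaking rule (first maximizer) and all function names are assumptions made for this example.

```python
def best_response(payoff, opponent_mix):
    """Return one pure best response (as a unit vector) to a mixed strategy.

    payoff[i][l] is the player's own payoff for pure action i against the
    opponent's pure action l. Ties are broken by taking the first maximizer,
    which is just one selection of the set-valued Argmax.
    """
    values = [sum(p * q for p, q in zip(row, opponent_mix)) for row in payoff]
    best = max(range(len(values)), key=values.__getitem__)
    return [1.0 if i == best else 0.0 for i in range(len(values))]


def fictitious_play(payoff1, payoff2, x, y, n_steps):
    """Euler scheme v_{n+1} - v_n = (br(v_n) - v_n) / (n + 1), without noise."""
    for n in range(1, n_steps + 1):
        step = 1.0 / (n + 1)
        bx = best_response(payoff1, y)   # player 1 responds to y
        by = best_response(payoff2, x)   # player 2 responds to x
        x = [xi + step * (bi - xi) for xi, bi in zip(x, bx)]
        y = [yi + step * (bi - yi) for yi, bi in zip(y, by)]
    return x, y


# Coordination game: both players get 2 at (action 0, action 0),
# 1 at (action 1, action 1) and 0 otherwise.
coordination = [[2.0, 0.0], [0.0, 1.0]]
x_final, y_final = fictitious_play(coordination, coordination,
                                   [0.6, 0.4], [0.6, 0.4], 1000)
print(x_final, y_final)
```

Started in the basin of the strict Nash equilibrium where both players use action 0, the empirical frequencies move towards that equilibrium, in line with the role strict equilibria play as attractors of the best response dynamic.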

We call $\mathrm{Att}(v)$ the attainability set of the discrete process $(v_n)_n$. Recall that $p \in \mathrm{Att}(v)$ if and only
if, for any neighborhood $N$ of $p$ and any $n_0 \in \mathbb{N}$,

$$P(\exists n \geq n_0 \mid v_n \in N) > 0.$$

Proposition 3.2. Assume that the exploration matrices have positive diagonal values. Then $\mathrm{Att}(v)$
is equal to the whole state space $\Delta$.

Proof. This is a consequence of the definition of an exploration matrix. From any instant $n$, for any
player $i$, any move $a^i$ and any positive integer $p$, player $i$ will play action $a^i$ $p$ times in a row with
positive probability. ■

Theorem 3.3. There exist positive values $\tilde{A}^p$, $p = 1, 2$ (which depend only on the payoff functions
and the exploration matrices) such that, if agent $p$ plays according to an MFP strategy with $\beta_n^p =
A^p \log n$ and $A^p < \tilde{A}^p$, then

$$P(L((v_n)_n) \subset A) > 0,$$

for any attractor $A$ for the set-valued dynamical system induced by $F$.




In particular, a strict Nash equilibrium is always an attractor for the best response dynamic. Hence

Corollary 3.4. Let $v = (x, y)$ be a strict Nash equilibrium. Then, under the assumptions of
Theorem 3.3, $P(v_n \to v) > 0$.

3.2 Proof of Theorem 3.3

Note that the Markov matrix $M_n^1(\cdot, \cdot; y)$ defined in the previous section is reversible with respect to
its invariant distribution $\pi_n^1[y]$, which is given, up to the normalizing constant, by

$$(\pi_n^1[y])_i \propto (\pi_0^1)_i \exp\left( \beta_n^1 U^1(i, y) \right).$$

Also, considering an irreducible Markovian matrix $M$ and its invariant probability measure $\pi$, one
can define the pseudo inverse $Q$ of $M$, characterized by

$$Q(I - M) = (I - M)Q = I - \Pi, \qquad Q \mathbf{1} = 0,$$

where $\Pi$ is the matrix defined by $\Pi(i, j) = \pi(j)$. Let us call $\pi_n^1$ and $Q_n^1$ (respectively $\pi_n^2$ and $Q_n^2$)
the invariant probability and the pseudo inverse of the matrix $M_n^1 := M_n^1(\cdot, \cdot; y_n)$ (resp. $M_n^2 :=
M_n^2(\cdot, \cdot; x_n)$). We now define the energy barrier of the exploration matrix $M_0^1$ with respect to the
payoff function $U^1$. Let $\Gamma_{i,j}$ be the set of admissible paths from $i$ to $j$ in the graph associated to
$M_0^1$: $\gamma = (i = i_0, i_1, \ldots, i_n = j)$ is admissible if $M_0^1(i_k, i_{k+1}) > 0$, $k = 0, \ldots, n - 1$. Then, denoting for
$y \in Y$,

$$\mathrm{Elev}(i, j; y) := \min \left\{ \max \{ -U^1(k, y) \mid k \in \gamma \}, \ \gamma \in \Gamma_{i,j} \right\},$$

we call

$$U^{1,\#}(y) := \max_{i,j} \left\{ \mathrm{Elev}(i, j; y) + U^1(i, y) + U^1(j, y) - \max U^1(\cdot, y) \right\}, \qquad U^{1,\#} := \max_{y \in Y} U^{1,\#}(y).$$

The quantity $U^{2,\#}$ is defined analogously. Let $\tilde{A}^p := 1 / (2 U^{p,\#})$. The following proposition
can be easily derived from the proof of Proposition 4.4 in Benaïm and Raimond [8].
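
For a fixed $y$, the elevation and the energy barrier can be computed by a minimax-path recursion of Floyd-Warshall type. The sketch below is only an illustration: the function names `elevation` and `u_sharp`, the convention that the trivial path $(i)$ is admissible, and the reading of the energy barrier as a maximum over all pairs $(i, j)$ are assumptions, not taken from [8].

```python
def elevation(M0, u):
    """Elev(i, j) = min over admissible paths of max over visited k of -u[k].

    M0 : exploration matrix (an edge i -> j exists when M0[i][j] > 0)
    u  : u[k] stands for the payoff U^1(k, y) at a fixed mixed strategy y
    Assumption: the trivial path (i) is admissible, so Elev(i, i) = -u[i].
    """
    n = len(u)
    inf = float("inf")
    elev = [[inf] * n for _ in range(n)]
    for i in range(n):
        elev[i][i] = -u[i]
        for j in range(n):
            if i != j and M0[i][j] > 0:
                elev[i][j] = max(-u[i], -u[j])
    # Bottleneck (minimax) variant of the Floyd-Warshall recursion.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = max(elev[i][k], elev[k][j])
                if via_k < elev[i][j]:
                    elev[i][j] = via_k
    return elev


def u_sharp(M0, u):
    """max over (i, j) of Elev(i, j) + u[i] + u[j] - max(u)."""
    elev = elevation(M0, u)
    top = max(u)
    n = len(u)
    return max(elev[i][j] + u[i] + u[j] - top
               for i in range(n) for j in range(n))


# Three actions on a line 0 - 1 - 2, with a low-payoff action in the middle.
M0 = [[0.5, 0.5, 0.0],
      [0.25, 0.5, 0.25],
      [0.0, 0.5, 0.5]]
u = [1.0, 0.0, 2.0]
elev = elevation(M0, u)
barrier = u_sharp(M0, u)
print(elev[0][2], barrier)
```

In this toy example, moving from action 0 to action 2 forces a passage through the low-payoff action 1, and that passage is what the elevation records.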

Proposition 3.5. Assume that the sequence j3 n satisfies 

— >n 0, for some < A 1 < A 1 . 

A L log n 

Then there exists a positive deterministic sequence {u n ) n such that 

a) < u n ^ n = 0, 

b) hm„ \TLn + i - n^| < u n ->•„ 0, 

c) hm„ |<3n+i -Qn\ <!i n -»J. 
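Let $\theta_n$ be the random variable

$$\theta_n := (\pi_n^1[y_n], \pi_n^2[x_n]).$$

We have the following:

$$v_{n+1} - v_n = \frac{1}{n+1} \left( -v_n + V_{n+1} \right) = \frac{1}{n+1} \left( -v_n + \theta_n + U_{n+1} \right),$$

with

$$U_{n+1} = V_{n+1} - \theta_n = \left( \delta_{X_{n+1}} - \pi_n^1[y_n], \ \delta_{Y_{n+1}} - \pi_n^2[x_n] \right).$$
There remains to prove that the noise $U_n$ satisfies the property (10).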
Lemma 3.6. Assume that the sequences $(\beta_n^p)_n$ ($p = 1, 2$) satisfy the assumption of Proposition 3.5.
Then the random sequence $\Delta(n, T)$ relative to $(1/n)_n$ and $(U_n)_n$ satisfies the inequality (10).



Proof. We call $\zeta_{n+1}$ the term $\delta_{X_{n+1}} - \pi_n^1[y_n]$, and we aim to prove that the inequality (10) is
satisfied for $(\zeta_n)_n$. Clearly, this will imply that (10) is also satisfied for $(U_n)_n$. Without loss of
generality, we therefore denote

$$\Delta(n, T) := \sup_{n < k \leq m(\tau_n + T)} \left\| \sum_{i=n}^{k-1} \frac{1}{i+1} \zeta_{i+1} \right\|.$$

First of all, $\zeta_{n+1}$ can be written

$$\zeta_{n+1} = \delta_{X_{n+1}} (\mathrm{Id} - \Pi_n) = \delta_{X_{n+1}} (Q_n - M_n Q_n).$$
There is then a natural decomposition:

$$\zeta_{n+1} = \left( \delta_{X_{n+1}} Q_n - \delta_{X_n} M_n Q_n \right) + \left( \delta_{X_n} M_n Q_n - \delta_{X_{n+1}} M_n Q_n \right).$$
The first term is a martingale difference, bounded by $|Q_n|$ (up to a constant). Hence it satisfies the

log n 



assumption 2) of Proposition 12. 15| with M n — ""^ 



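Now, for the second term, we have

$$\left\| \sum_{i=n}^{k-1} \frac{1}{i+1} \left( \delta_{X_i} M_i Q_i - \delta_{X_{i+1}} M_i Q_i \right) \right\| \leq \left\| \sum_{i=n}^{k-1} \frac{1}{i+1} \left( \delta_{X_{i+1}} M_{i+1} Q_{i+1} - \delta_{X_{i+1}} M_i Q_i \right) \right\|$$

$$+ \left\| \sum_{i=n}^{k-1} \left( \frac{1}{i} \delta_{X_i} M_i Q_i - \frac{1}{i+1} \delta_{X_{i+1}} M_{i+1} Q_{i+1} \right) \right\| + T \sup_{n \leq i \leq k-1} \frac{|Q_i|}{i},$$

since

$$\left\| \sum_{i=n}^{k-1} \frac{1}{i(i+1)} \delta_{X_i} M_i Q_i \right\| \leq \max \left\{ \frac{|Q_i|}{i} \ \Big| \ n \leq i \leq k - 1 \right\} \sum_{i=n,\ldots,k-1} \frac{1}{i+1}.$$

The first term on the right side can be written

$$\sum_{i=n}^{k-1} \frac{1}{i+1} \delta_{X_{i+1}} \left( Q_{i+1} - Q_i + \Pi_{i+1} - \Pi_i \right)$$

and is bounded by the quantity $T \max \{ |Q_{i+1} - Q_i| + |\Pi_{i+1} - \Pi_i| \mid i = n, \ldots, k - 1 \}$.
Finally, we have

$$\left\| \sum_{i=n}^{k-1} \frac{1}{i+1} \left( \delta_{X_i} M_i Q_i - \delta_{X_{i+1}} M_i Q_i \right) \right\| \leq K(T) \sup_{i \geq n} \left( \frac{|Q_i|}{i} + |Q_{i+1} - Q_i| + |\Pi_{i+1} - \Pi_i| \right).$$

By Proposition 3.5, the term on the right decreases to zero and

$$P\left( \sup_{m \geq n} \Delta(m, T) \geq \varepsilon \,\Big|\, \mathcal{F}_n \right) \leq \omega(n, \varepsilon, T) \downarrow_n 0.$$

This concludes the proof. ■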



4 Proof of Lemma 2.9



In the following, $L^1([0,T])$ is the set of all Lebesgue-integrable functions from $[0, T]$ to $\mathbb{R}^m$. Let
$H : [0,T] \rightrightarrows \mathbb{R}^m$ be a set-valued map such that, for any $u \in [0,T]$, $H(u)$ is a nonempty subset of
$\mathbb{R}^m$.

Definition 4.1. We call $\mathcal{S}(H)$ the set of integrable selections from $[0,T]$ to $\mathbb{R}^m$:

$$\mathcal{S}(H) := \{ h \in L^1([0,T]) \text{ such that } \forall u \in [0,T], \ h(u) \in H(u) \}.$$
With such a definition, we introduce the set-valued integral of $H$ on $[0, T]$:

$$\int_{[0,T]} H(u) \, du := \left\{ \int_{[0,T]} h(u) \, du \ \Big| \ h \in \mathcal{S}(H) \right\}.$$

$H$ is said to be measurable if its graph $\{ (t, x) \mid x \in H(t) \}$ is measurable, and integrably bounded if
there exists an integrable function $h : [0, T] \to \mathbb{R}_+$ such that

$$\sup_{x \in H(t)} \|x\| \leq h(t), \quad \forall t \in [0,T].$$

Let $h \in \mathcal{S}(H)$. We call $\Phi_h$ the map defined by $t \in [0, T] \mapsto \int_0^t h(u) \, du$.
The following theorem is due to Aumann [2].

Theorem 4.2. Let $H$ be a set-valued map on $[0,T]$ with nonempty images. Then

* $\int_{[0,T]} H(u) \, du$ is convex,

* If $H$ is measurable and integrably bounded then $\int_{[0,T]} H(u) \, du$ is nonempty.

* If $H$ has closed images then $\int_{[0,T]} H(u) \, du$ is compact.

* If $(H_k)_k$ is a sequence of uniformly integrably bounded set-valued functions then

$$\limsup_k \int_{[0,T]} H_k(u) \, du \subset \int_{[0,T]} \limsup_k H_k(u) \, du,$$

where $x \in \limsup_k A_k$ if and only if every neighborhood of $x$ intersects infinitely many $A_k$.
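
A standard one-dimensional example, added here as an illustration and not taken from the paper, shows why the convexity statement does not require convex images. It takes $T = 1$, $m = 1$ and the constant, nonconvex map $H(u) = \{0, 1\}$.

```latex
% Integrable selections of H(u) = {0,1} are the indicators of measurable sets E in [0,1].
\mathcal{S}(H) = \{ \mathbf{1}_E \ : \ E \subset [0,1] \text{ measurable} \},
\qquad
\int_{[0,1]} \mathbf{1}_E(u) \, du = \mathrm{Leb}(E).

% Every value in [0,1] is reached, e.g. with E = [0,a], so the set-valued integral is
\int_{[0,1]} H(u) \, du = \{ \mathrm{Leb}(E) \} = [0, 1],
% which is convex and compact although H(u) = {0,1} is not convex.
```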

The next proposition is not a direct consequence of these results. However, the proofs of third and 
fourth points can be adapted to derive it: 

Proposition 4.3. Let {H n ) be a sequence of set-valued map from [0, T] to K m uniformly integrally 
bounded and H be a set-valued map with non empty images. We assume that, for any r G [0,T], 
lim sup„ H n (r) C H(t). For any n G N, let h n G §(H n ). Then, 

(i) If, for any u, H(u) is convex, there exists h G §(H) such that h n converges weakly in L 1 ([0, T]) 
to h (up to a subsequence) . In particular, ^h„ converges simply to ^h- 

(ii) Without the convexity assumption, there exists a function h on [0, T] with the property that, 
for any u G [0, T], h(u) G co(H(u)) and such that h n converges weakly in L 1 ([0,T]) to h. In 
particular, ^h„ converges simply to ^h- 

Proof. Since the sequence $(H_n)$ is uniformly integrally bounded, the $h_n$ are all bounded by an
integrable function $g : [0,T] \to \mathbb{R}_+$. Then there is a subsequence of $(h_n)$ with a weak limit
$h \in L^1([0,T])$ (see Dunford and Schwartz [13], Theorem IV.8.9). We may assume without loss of
generality that $(h_n)$ itself converges weakly to $h$. It remains to check that $h$ belongs to the
set $S(H)$.

For $A \subset L^1([0,T])$, we denote by $\mathrm{co}(A)$ the convex hull of $A$ (i.e. the smallest convex set containing $A$) and
by $\overline{\mathrm{co}}(A)$ the smallest closed (for the $L^1$ norm) convex set containing $A$.
Recall that, by Mazur's theorem, a convex subset of $L^1([0,T])$ is closed if and only if it is weakly
closed. Consequently, for any $k \in \mathbb{N}$, the set $\overline{\mathrm{co}}((h_n)_{n \ge k})$ is closed and convex and therefore weakly
closed. Hence it contains $h$, which belongs to the weakly closed convex hull of $(h_n)_{n \ge k}$. Thus,

$$h \in \overline{\mathrm{co}}((h_n)_{n \ge k}) = \overline{\mathrm{co}((h_n)_{n \ge k})},$$

which means that there exists $g_k \in \mathrm{co}((h_n)_{n \ge k})$ such that $\|h - g_k\|_{L^1} < 1/k$. Hence $(g_k)_k$ converges
to $h$ in $L^1$ and we may assume without loss of generality that $(g_k)_k$ converges to $h$ almost everywhere
on $[0,T]$.

By Carathéodory's theorem, the convex hull of a set $A \subset \mathbb{R}^m$ is the set of all barycenters of families of
$m+1$ elements of $A$. Consequently, for any $k$ and $u \in [0,T]$, since $(h_n(u))_{n \ge k}$ is a set of points in
$\mathbb{R}^m$, we have

$$g_k(u) = \sum_{j=0}^{m} \lambda_k^j(u)\, e_k^j(u),$$

where $\lambda_k^j(u) \ge 0$, $\sum_{j=0}^{m} \lambda_k^j(u) = 1$ and $e_k^j(u) \in \{h_n(u) \mid n \ge k\}$. By compactness, we may assume
that, for any $j$, $(e_k^j(u))_k$ converges to some $e^j(u)$ and $(\lambda_k^j(u))_k$ converges to $\lambda^j(u)$, such that
$\lambda^j(u) \ge 0$ and $\sum_{j=0}^{m} \lambda^j(u) = 1$. Finally,

$$h(u) = \lim_k g_k(u) = \sum_{j=0}^{m} \lambda^j(u)\, e^j(u).$$

For any $j$, since $e^j(u)$ belongs to the limit set of the sequence $(h_n(u))_n$, it belongs to $H(u)$. Hence,
$h(u)$ belongs to $\mathrm{co}(H(u))$ and, if $H(u)$ is convex, $h(u) \in H(u)$. The proof is complete. ■
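
The role of the convex hull in point (ii) can be seen on a simple oscillating sequence. The maps and selections below are an illustrative choice of ours, not taken from the paper; the sketch only aims to show that the weak limit may leave $H(u)$ while staying in its convex hull.

```latex
% Illustrative example (assumptions: T = 1, m = 1, H_n(u) = H(u) = \{-1, 1\} for all n, u).
% Take the square waves of period 1/n:
\[
  h_n(u) :=
  \begin{cases}
    \phantom{-}1 & \text{if } \lfloor 2nu \rfloor \text{ is even},\\
    -1 & \text{if } \lfloor 2nu \rfloor \text{ is odd},
  \end{cases}
  \qquad u \in [0,1].
\]
% Each h_n integrates to 0 over every full period, and within a period the
% primitive rises to 1/(2n) and returns to 0, so
\[
  0 \le \int_0^t h_n(u)\,du \le \frac{1}{2n}, \qquad t \in [0,1].
\]
% Hence the primitives converge pointwise to 0, the primitive of h \equiv 0,
% and h_n \to 0 weakly in L^1([0,1]).
% Here h(u) = 0 \in \mathrm{co}(H(u)) = [-1, 1], but h(u) \notin H(u) = \{-1, 1\}.
```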

Lemma 4.4. Let $(K,d)$ be a compact metric space and $(A_n)_n$, $A : K \rightrightarrows K$ be set-valued maps such
that $A$ is standard and $\mathrm{Fix}(A) = \{x \in K \mid x \in A(x)\} \neq \emptyset$. Assume that, for any $x \in K$,

* $A_{n+1}(x) \subset A_n(x)$,

* $\lim_n x_n = x \ \Rightarrow\ \limsup_n A_n(x_n) \subset A(x)$.

Then, for all $\delta > 0$, there exist $\varepsilon > 0$ and $n_0 \in \mathbb{N}$ such that, for all $n \ge n_0$,

$$d(x, A_n(x)) < \varepsilon \ \Rightarrow\ d(x, \mathrm{Fix}(A)) < \delta.$$

Proof. First notice that, since $\mathrm{Fix}(A)$ is nonempty and $A_{n+1}(x) \subset A_n(x)$, there exists $x \in K$ such
that $x \in A_n(x)$ for all $n$. Assume that there are $\delta > 0$, a strictly increasing sequence of integers
$(n_k)_{k \ge 1}$ and a sequence $(x_k)_{k \ge 1}$ in $K$ such that

$$d(x_k, A_{n_k}(x_k)) < \frac{1}{k} \quad \text{and} \quad d(x_k, \mathrm{Fix}(A)) \ge \delta.$$

Then, there exists a sequence $(y_k)_{k \ge 1}$ such that $y_k \in A_{n_k}(x_k)$ and

$$d(x_k, y_k) < \frac{1}{k} \quad \text{and} \quad d(x_k, \mathrm{Fix}(A)) \ge \delta.$$

Without loss of generality, we may assume that $x_k \to x \in K$ and $y_k \to y$. Consequently,
$d(x, \mathrm{Fix}(A)) \ge \delta > 0$ and $x = y$. On the other hand,

$$y \in \limsup_k A_{n_k}(x_k) \subset \limsup_k A_k(x_k) \subset A(x),$$

which means that $x \in A(x)$, a contradiction. ■

Remark 4.5. If the $A_n$ are closed, then it is sufficient to assume that $\bigcap_n A_n(x) = A(x)$, $\forall x$. Indeed,
by monotonicity and the fact that $A_{n_k}$ is closed, $y \in A_{n_k}(x)$, $\forall k$.

From now on, we consider a bounded standard set-valued map $F : \mathbb{R}^m \rightrightarrows \mathbb{R}^m$ and $T > 0$. Let $\delta$ be a
positive real number. Then $F^\delta$ is the set-valued map defined by (6) and $A^\delta$ is the set-valued map
defined by (9), extended to $C([0,T],\mathbb{R}^m)$. Note that, with our current notations, the definition of
the set-valued map $A^\delta$ can be written, for $\delta \ge 0$:

$$A^\delta : C([0,T],\mathbb{R}^m) \rightrightarrows C([0,T],\mathbb{R}^m), \quad z \mapsto \{\, z(0) + \Phi_h \mid h \in S(F^\delta(z)) \,\}.$$

Proposition 4.6. $A$ is a closed set-valued map with nonempty images.

Proof. First, $A$ has nonempty images since $\int_0^T F(z)$ is nonempty, for any $z$ (see Theorem 4.2).
Let $(z_n)_n$ be a sequence of $C([0,T],\mathbb{R}^m)$ which converges to some $z$ in $C([0,T],\mathbb{R}^m)$, and let $(y_n)_n$
be a sequence converging to $y$ such that, for all $n \in \mathbb{N}$, $y_n \in A(z_n)$. This implies that

$$\forall n \in \mathbb{N},\ \exists h_n \in S(F(z_n)) \text{ such that } y_n(\tau) = z_n(0) + \Phi_{h_n}(\tau).$$

We call $H_n := F(z_n)$ and $H := F(z)$. By the assumptions made on $F$, $H_n$ and $H$ have compact,
convex and nonempty values. For $\tau \in [0,T]$, $z_n(\tau)$ converges to $z(\tau)$ and, since the graph of $F$ is
closed,

$$\limsup_n H_n(\tau) = \limsup_n F(z_n(\tau)) \subset F(z(\tau)) = H(\tau).$$

The assumptions of Proposition 4.3 are satisfied. Hence, there exists $h \in S(F(z))$ such that $y =
z(0) + \Phi_h$. ■

Lemma 4.7. For any $\delta > 0$, $F^\delta$ is a closed set-valued map with nonempty images.

Proof. Let $x \in \mathbb{R}^m$. $F(x)$ is contained in $F^\delta(x)$, which is therefore nonempty. Let $x_n \to x$ and $(y_n)_n$ be a
sequence with $y_n \in F^\delta(x_n)$, converging to some $y$. Then there exists a sequence $(z_n)_n$ such that

$$d(z_n, x_n) < \delta \quad \text{and} \quad d(y_n, F(z_n)) < \delta.$$

Hence there exists a sequence $(a_n)_n$ such that $a_n \in F(z_n)$ and $d(y_n, a_n) < \delta$. Without loss of
generality, we may assume that $z_n \to z$ and $a_n \to a$. By closedness of the graph of $F$, we obtain

$$d(z, x) \le \delta, \quad a \in F(z) \quad \text{and} \quad d(y, a) \le \delta,$$

and $F^\delta$ is closed (and, in particular, has closed images). ■

Remark 4.8. Note that the images are, a priori, not convex. 

Lemma 4.9. Let $(x_n)_n$ be a sequence of $\mathbb{R}^m$ converging to $x$ and $(\delta_n)_n$ be a positive, vanishing
sequence. Then we have

$$\limsup_{n \to +\infty} F^{\delta_n}(x_n) \subset F(x).$$

Proof. Let $y \in \limsup_n F^{\delta_n}(x_n)$. By definition, there exists a sequence $(y_n)$ which converges to $y$
and such that $y_n \in F^{\delta_n}(x_n)$ (actually it is a subsequence, but there is no loss of generality in keeping
the initial indexation). Hence there exists a sequence $(z_n)$ such that

$$d(z_n, x_n) < \delta_n, \quad d(y_n, F(z_n)) < \delta_n,$$

which means that $d(y_n, a_n) < \delta_n$ for some sequence $(a_n)_n$ satisfying $a_n \in F(z_n)$. Without loss of
generality we may assume that $a_n \to a$ and $z_n \to z$. Since $d(z_n, x_n) \to 0$ and $d(y_n, a_n) \to 0$, we have
$z = x$ and $y = a$; as the graph of $F$ is closed, $y = a \in F(z) = F(x)$. ■
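
The lemma can be checked by hand in a single-valued case. The sketch below rests on two assumptions of ours: the map $F$ is an illustrative choice, and the perturbed map is taken to be the one used in the proofs above (a point $y$ belongs to the perturbed image of $x$ at level $\delta$ when some $z$ satisfies $d(z,x) < \delta$ and $d(y,F(z)) < \delta$).

```latex
% Illustrative example (assumptions: m = 1, F(x) = \{\varphi(x)\} with \varphi bounded and
% 1-Lipschitz, e.g. \varphi = \arctan; and y \in F^{\delta}(x) iff there is z with
% |z - x| < \delta and |y - \varphi(z)| < \delta).
% If y \in F^{\delta_n}(x_n), pick such a z; then
\[
  |y - \varphi(x_n)| \le |y - \varphi(z)| + |\varphi(z) - \varphi(x_n)| < \delta_n + |z - x_n| < 2\delta_n ,
\]
% so F^{\delta_n}(x_n) \subset [\varphi(x_n) - 2\delta_n,\ \varphi(x_n) + 2\delta_n].
% With x_n \to x, \delta_n \to 0 and y_n \in F^{\delta_n}(x_n), y_n \to y (along a subsequence),
\[
  |y - \varphi(x)| \le |y - y_n| + |y_n - \varphi(x_n)| + |\varphi(x_n) - \varphi(x)| \longrightarrow 0 ,
\]
% i.e. \limsup_n F^{\delta_n}(x_n) \subset \{\varphi(x)\} = F(x).
```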
Corollary 4.10. Let $(z_n)_n$ be a sequence converging to $z$ in $C([0,T],\mathbb{R}^m)$. Then we have, for any
positive, vanishing sequence $(\delta_n)_n$,

$$\limsup_{n \to +\infty} A^{\delta_n}(z_n) \subset A(z).$$

Proof. Let $y \in \limsup_n A^{\delta_n}(z_n)$. This means that there exists a sequence $y_n \in A^{\delta_n}(z_n)$ which
converges to $y$. Hence, for all $n \in \mathbb{N}$, there exists $h_n \in S(F^{\delta_n}(z_n))$ such that

$$\forall \tau \in [0,T], \quad y_n(\tau) = z_n(0) + \int_0^\tau h_n(u)\,du.$$

By Proposition 4.3 and Lemma 4.9, there exists a function $h$ on $[0,T]$ such that

$$\int_0^\tau h_n(u)\,du \xrightarrow[n \to +\infty]{} \int_0^\tau h(u)\,du, \quad \forall \tau \in [0,T],$$

and $h \in S(F(z))$, which completes the proof. ■



Corollary 4.11. Let $A_n := A^{\delta_n}$. Suppose there exists a compact $K \subset C([0,T],\mathbb{R}^m)$ such that
$A_n : K \rightrightarrows K$ for all $n$. Then, for all $\delta > 0$, there exist $\varepsilon > 0$ and $n_0 \in \mathbb{N}$ such that, for all $n \ge n_0$,

$$d(z, A_n(z)) < \varepsilon \ \Rightarrow\ d(z, \mathrm{Fix}(A)) < \delta.$$

Proof. This result follows from Lemma 4.4 and Corollary 4.10. ■